NSF PAR Search | NSF Public Access Repository

Face-to-Face Contrastive Learning for Social Intelligence Question-Answering

https://doi.org/10.1109/FG57933.2023.10042612

Wilf, Alex; Ma, Martin Q.; Liang, Paul Pu; Zadeh, Amir; Morency, Louis-Philippe (January 2023, 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG))

Full Text Available
On Emergent Communication in Competitive Multi-Agent Teams

Liang, Paul Pu; Chen, Jeffrey; Salakhutdinov, Ruslan; Morency, Louis-Philippe; Kottur, Satwik (January 2020, Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems)

Several recent works have found the emergence of grounded compositional language in the communication protocols developed by mostly cooperative multi-agent systems when learned end-to-end to maximize performance on a downstream task. However, human populations learn to solve complex tasks involving communicative behaviors not only in fully cooperative settings but also in scenarios where competition acts as an additional external pressure for improvement. In this work, we investigate whether competition for performance from an external, similar agent team could act as a social influence that encourages multi-agent populations to develop better communication protocols for improved performance, compositionality, and convergence speed. We start from Task & Talk, a previously proposed referential game between two cooperative agents, as our testbed and extend it into Task, Talk & Compete, a game involving two competitive teams each consisting of two aforementioned cooperative agents. Using this new setting, we provide an empirical study demonstrating the impact of competitive influence on multi-agent teams. Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages that are more informative and compositional.
Full Text Available
Found in Translation: Learning Robust Joint Representations by Cyclic Translations between Modalities

https://doi.org/10.1609/aaai.v33i01.33016892

Pham, Hai; Liang, Paul Pu; Manzini, Thomas; Morency, Louis-Philippe; Póczos, Barnabás (July 2019, Proceedings of the AAAI Conference on Artificial Intelligence)

Multimodal sentiment analysis is a core research area that studies speaker sentiment expressed from the language, visual, and acoustic modalities. The central challenge in multimodal learning involves inferring joint representations that can process and relate information from these modalities. However, existing work learns joint representations by requiring all modalities as input and as a result, the learned representations may be sensitive to noisy or missing modalities at test time. With the recent success of sequence to sequence (Seq2Seq) models in machine translation, there is an opportunity to explore new ways of learning joint representations that may not require all input modalities at test time. In this paper, we propose a method to learn robust joint representations by translating between modalities. Our method is based on the key insight that translation from a source to a target modality provides a method of learning joint representations using only the source modality as input. We augment modality translations with a cycle consistency loss to ensure that our joint representations retain maximal information from all modalities. Once our translation model is trained with paired multimodal data, we only need data from the source modality at test time for final sentiment prediction. This ensures that our model remains robust to perturbations or missing information in the other modalities. We train our model with a coupled translation-prediction objective and it achieves new state-of-the-art results on multimodal sentiment analysis datasets: CMU-MOSI, ICT-MMMO, and YouTube. Additional experiments show that our model learns increasingly discriminative joint representations with more input modalities while maintaining robustness to missing or perturbed modalities.
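The idea of pairing a translation loss with a cycle consistency term and a prediction term can be illustrated with a toy sketch. This is not the authors' implementation: the element-wise affine "translators", the linear prediction head, the function names, and the loss weights are all hypothetical stand-ins for the Seq2Seq models and objective that the abstract only describes at a high level.

```python
# Toy sketch of a coupled translation-prediction loss with a cycle term.
# Assumptions (not from the source): modalities are short lists of floats,
# translators are element-wise affine maps, and the prediction head is a
# weighted mean of the translated vector.

def translate(x, scale, shift):
    """Toy translator standing in for a Seq2Seq model."""
    return [scale * v + shift for v in x]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def coupled_loss(source, target, label, fwd, bwd, head_weight,
                 cycle_weight=1.0, pred_weight=1.0):
    """Return the translation, cycle, prediction, and total losses.

    fwd and bwd are (scale, shift) pairs for the forward and backward
    toy translators; only `source` is needed to form the prediction.
    """
    # Forward translation: source modality -> target modality.
    target_hat = translate(source, *fwd)
    # Backward translation: translated target -> source modality.
    source_cycle = translate(target_hat, *bwd)

    translation = mse(target_hat, target)
    cycle = mse(source_cycle, source)

    # Predict the label from the translated representation alone.
    pred = head_weight * sum(target_hat) / len(target_hat)
    prediction = (pred - label) ** 2

    total = translation + cycle_weight * cycle + pred_weight * prediction
    return {"translation": translation, "cycle": cycle,
            "prediction": prediction, "total": total}


if __name__ == "__main__":
    losses = coupled_loss(source=[1.0, 2.0], target=[2.0, 4.0], label=3.0,
                          fwd=(2.0, 0.0), bwd=(0.5, 0.0), head_weight=1.0)
    print(losses)
```

In this sketch the cycle term is zero only when the backward translator inverts the forward one, which is one way to read the abstract's claim that cycle consistency helps the representation retain information from both modalities.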
Full Text Available
Towards Debiasing Sentence Representations

https://doi.org/10.18653/v1/2020.acl-main.488

Liang, Paul Pu; Li, Irene Mengze; Zheng, Emily; Lim, Yao Chong; Salakhutdinov, Ruslan; Morency, Louis-Philippe (January 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics)

As natural language processing methods are increasingly deployed in real-world scenarios such as healthcare, legal systems, and social science, it becomes necessary to recognize the role they potentially play in shaping social biases and stereotypes. Previous work has revealed the presence of social biases in widely used word embeddings involving gender, race, religion, and other social constructs. While some methods were proposed to debias these word-level embeddings, there is a need to perform debiasing at the sentence-level given the recent shift towards new contextualized sentence representations such as ELMo and BERT. In this paper, we investigate the presence of social biases in sentence-level representations and propose a new method, Sent-Debias, to reduce these biases. We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks such as sentiment analysis, linguistic acceptability, and natural language understanding. We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
Full Text Available
Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors

https://doi.org/10.1609/aaai.v33i01.33017216

Wang, Yansen; Shen, Ying; Liu, Zhun; Liang, Paul Pu; Zadeh, Amir; Morency, Louis-Philippe (July 2019, Proceedings of the AAAI Conference on Artificial Intelligence)

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN) that models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.
Full Text Available
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence

Zadeh, Amir; Chan, Michael; Liang, Paul Pu; Tong, Edmund; Morency, Louis-Philippe (January 2019, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR))

As intelligent systems increasingly blend into our everyday life, artificial social intelligence becomes a prominent area of research. Intelligent systems must be socially intelligent in order to comprehend human intents and maintain a rich level of interaction with humans. Human language offers a unique unconstrained approach to probe through questions and reason through answers about social situations. This unconstrained approach extends previous attempts to model social intelligence through numeric supervision (e.g. sentiment and emotion labels). In this paper, we introduce Social-IQ, an unconstrained benchmark specifically designed to train and evaluate socially intelligent technologies. By providing a rich source of open-ended questions and answers, Social-IQ opens the door to explainable social intelligence. The dataset contains rigorously annotated and validated videos, questions and answers, as well as annotations for the complexity level of each question and answer. Social-IQ contains 1,250 natural in-the-wild social situations, 7,500 questions and 52,500 correct and incorrect answers. Although humans can reason about social situations with very high accuracy (95.08%), existing state-of-the-art computational models struggle on this task. As a result, Social-IQ brings novel challenges that will spark future research in social intelligence modeling, visual reasoning, and multimodal question answering (QA).
Full Text Available
Learning Factorized Multimodal Representations

Tsai, Yao-Hung Hubert; Liang, Paul Pu; Zadeh, Amir; Morency, Louis-Philippe; Salakhutdinov, Ruslan (February 2019, International Conference on Learning Representations)

Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information, there are two key challenges to address when learning from multimodal data: 1) models must learn the complex intra-modal and cross-modal interactions for prediction and 2) models must be robust to unexpected missing or noisy modalities during testing. In this paper, we propose to optimize for a joint generative-discriminative objective across multimodal data and labels. We introduce a model that factorizes representations into two sets of independent factors: multimodal discriminative and modality-specific generative factors. Multimodal discriminative factors are shared across all modalities and contain joint multimodal features required for discriminative tasks such as sentiment prediction. Modality-specific generative factors are unique for each modality and contain the information required for generating data. Experimental results show that our model is able to learn meaningful multimodal representations that achieve state-of-the-art or competitive performance on six multimodal datasets. Our model demonstrates flexible generative capabilities by conditioning on independent factors and can reconstruct missing modalities without significantly impacting performance. Lastly, we interpret our factorized representations to understand the interactions that influence multimodal learning.
Full Text Available
Strong and Simple Baselines for Multimodal Utterance Embeddings

https://doi.org/10.18653/v1/N19-1267

Liang, Paul Pu; Lim, Yao Chong; Tsai, Yao-Hung Hubert; Salakhutdinov, Ruslan; Morency, Louis-Philippe (January 2019, Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Full Text Available
Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization

https://doi.org/10.18653/v1/P19-1152

Liang, Paul Pu; Liu, Zhun; Tsai, Yao-Hung Hubert; Zhao, Qibin; Salakhutdinov, Ruslan; Morency, Louis-Philippe (January 2019, Proceedings of the Annual Meeting of the Association for Computational Linguistics)

Full Text Available
Multimodal Transformer for Unaligned Multimodal Language Sequences

https://doi.org/10.18653/v1/P19-1656

Tsai, Yao-Hung Hubert; Bai, Shaojie; Liang, Paul Pu; Kolter, J. Zico; Morency, Louis-Philippe; Salakhutdinov, Ruslan (January 2019, Proceedings of the Annual Meeting of the Association for Computational Linguistics)

Full Text Available
